Table of Contents

Review of Assignment 1 and 2
Download required Packages
Non-thresholded Gene set Enrichment Analysis
Visualizing Gene Set Enrichment Analysis in Cytoscape
Interpretation of the annotated network
Post Analysis

Intro / Review of Assignment 1 and 2

Expression Data set: GSE84054 link to the GEO page: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE84054

In Assignment 1, the expression data set was read as a dataset in order to filter and normalize. This procedure was completed to have a sorted and simplified dataset for the analysis conducted in assignment 2 and 3.

In Assignment 2, differential gene expression analysis was conducted to identify the mahor up and downregulated pathways. The result demonstrated that both up and down regulated group were mainly associated with metabollic processes. This will be further explored in A3 using non-thresholded analysis and visualization using cytoscape.

Download required Packages

if (! requireNamespace("biocManager", quietly = TRUE)){
  install.packages("BiocManager", repos = "http://cran.us.r-project.org")
}
## 
## The downloaded binary packages are in
##  /var/folders/_c/59gypyts1hn8xzqlpwyxvlw80000gn/T//RtmpYUm5gf/downloaded_packages
if (!requireNamespace("knitr", quietly = TRUE))
  BiocManager::install("knitr")

if (!requireNamespace("GSA", quietly = TRUE))
  BiocManager::install("GSA")

if (!requireNamespace("RCurl", quietly = TRUE))
  BiocManager::install("RCurl")
if (! requireNamespace("ComplexHeatmap", quietly = TRUE))
  BiocManager::install("ComplexHeatmap")

if (! requireNamespace("gprofiler2", quietly = TRUE))
  BiocManager::install("gprofiler2")
if (!requireNamespace("edgeR", quietly = TRUE))
  BiocManager::install("edgeR")
if (!requireNamespace("circlize", quietly = TRUE))
  BiocManager::install("circlize")

library

library(BiocManager)
## Bioconductor version '3.12' is out-of-date; the current release version '3.14'
##   is available with R version '4.1'; see https://bioconductor.org/install
library(GSA)
## Warning: package 'GSA' was built under R version 4.0.5
library(RCurl)
## Warning: package 'RCurl' was built under R version 4.0.5
library(knitr)
## Warning: package 'knitr' was built under R version 4.0.5
library(data.table)
library(ComplexHeatmap)
## Loading required package: grid
## ========================================
## ComplexHeatmap version 2.6.2
## Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
## Github page: https://github.com/jokergoo/ComplexHeatmap
## Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
## 
## If you use it in published research, please cite:
## Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional 
##   genomic data. Bioinformatics 2016.
## 
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(ComplexHeatmap))
## ========================================
library(gprofiler2)
library(edgeR)
## Loading required package: limma
library(circlize)
## Warning: package 'circlize' was built under R version 4.0.5
## ========================================
## circlize version 0.4.14
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
## 
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
##   in R. Bioinformatics 2014.
## 
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(circlize))
## ========================================
library(Biobase)
## Loading required package: BiocGenerics
## Warning: package 'BiocGenerics' was built under R version 4.0.5
## Loading required package: parallel
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
## 
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB
## The following object is masked from 'package:limma':
## 
##     plotMA
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
##     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
##     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
##     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
##     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
##     union, unique, unsplit, which.max, which.min
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.

Non-thresholded Gene set Enrichment Analysis

gene set

# code retrieved from lecture slide
options(timeout = 1000)
gmt_url = "http://download.baderlab.org/EM_Genesets/current_release/Human/symbol/"
# list all the files on the server 
filenames = RCurl:: getURL(gmt_url)
tc = textConnection(filenames)
contents = readLines(tc)
close(tc)
# get the gmt that has all the pathways and does not include terms inferred 
# from electronic annotations(IEA) start with gmt file that has pathways only
rx = gregexpr("(?<=<a href=\")(.*.GOBP_AllPathways_no_GO_iea.*.)(.gmt)(?=\">)", contents, 
    perl = TRUE)
gmt_file = unlist(regmatches(contents, rx))
dest_gmt_file <- file.path("~/Desktop/bcb420/projects", gmt_file)
download.file(paste(gmt_url, gmt_file, sep = ""), destfile = dest_gmt_file)

Sort by rank

Necessary codes from A1 and A2

### From a1
nor_count <- read.table(file ="GSE84054_RLDNormalizedCount_12patientER.txt.gz", header = TRUE)

dt_type <- data.frame("Type"=1:24)
for (type1 in 1:12){
  dt_type$Type[type1] = "Primary Tumour"
}
for (type2 in 13:24){
  dt_type$Type[type2] = "Sphere"
}
rownames(dt_type) <- colnames(exp)[2:25]
### 

type_map <- nor_count[,3:ncol(nor_count)]
rownames(type_map) <- nor_count$EnsemblGeneID
colnames(type_map) <- rownames(dt_type)
modelDesign <- model.matrix(~ dt_type$Type)
head(modelDesign)
##   (Intercept) dt_type$TypeSphere
## 1           1                  0
## 2           1                  0
## 3           1                  0
## 4           1                  0
## 5           1                  0
## 6           1                  0
expressionMatrix <- as.matrix(nor_count[, 3:ncol(nor_count)])
rownames(expressionMatrix) <- nor_count$EnsemblGeneID
colnames(expressionMatrix) <- colnames(nor_count)[3:ncol(nor_count)]
minimalSet <- ExpressionSet(assayData = expressionMatrix)
fit <- lmFit(minimalSet, modelDesign)
#Apply empirical Bayes to compute differential expression
fit2 <- eBayes(fit, trend = TRUE) # trend=TRUE specific to RNA-seq data
topfit <- topTable(fit2, coef = ncol(modelDesign), 
                   adjust.method = "BH", 
                   number = nrow(expressionMatrix))
# merge hgnc names to topfit table 
output_hits <- merge(nor_count[,1:2], topfit, by.y = 0, by.x = 1, all.y = TRUE)

# sort by p-value 
output_hits <- output_hits[order(output_hits$P.Value), ]
### From a1
if (!file.exists("GSE84054_Rawcount_12patientER.txt.gz")){
  GSE84054 <- GEOquery::getGEOSuppFiles('GSE84054')
}
## Setting options('download.file.method.GEOquery'='auto')
## Setting options('GEOquery.inmemory.gpl'=FALSE)
fnames = rownames(GSE84054)
exp <- read.delim(fnames[2], header=TRUE, check.names = FALSE)
cpms = edgeR::cpm(exp[, 2:25])
keep = rowSums(cpms > 1) >= 3 # Dataset of 3 samples per group 
filtered_exp <- exp[keep,]
mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
## Ensembl site unresponsive, trying asia mirror
map_exp <- biomaRt::getBM(filters = "ensembl_gene_id", attributes = c("hgnc_symbol", "ensembl_gene_id"), values = filtered_exp[1], mart = mart)
copy_filter <- filtered_exp
copy_filter <- merge(map_exp, copy_filter, by.x = 2, by.y = 1, all.y = TRUE) 
### 

mat <- as.matrix(copy_filter[,3:26])
rownames(mat) <- copy_filter$ensembl_gene_id
d = edgeR::DGEList(counts = mat, group = dt_type$Type)
# estimate Dispersion 
d <- estimateDisp(d, modelDesign)
fit <- edgeR::glmQLFit(d, modelDesign)
qlf.pos_vs_neg <- edgeR::glmQLFTest(fit, coef = 'dt_type$TypeSphere')
head(qlf.pos_vs_neg)
## An object of class "DGELRT"
## $coefficients
##                 (Intercept) dt_type$TypeSphere
## ENSG00000000003   -9.644604         0.02426166
## ENSG00000000419  -10.142654         0.50644680
## ENSG00000000457  -11.227809        -0.21846437
## ENSG00000000460  -12.010001        -0.20673500
## ENSG00000000938  -12.264641        -3.06816274
## ENSG00000000971   -9.705494        -3.93967286
## 
## $fitted.values
##                     RHB037    RHB038    RHB041    RHB049     RHB050     RHB052
## ENSG00000000003 1197.69922 1375.9316 1295.7414 394.91545 1102.50688 1195.22286
## ENSG00000000419  727.81894  836.1274  787.3973 239.98257  669.97237  726.31410
## ENSG00000000457  245.82498  282.4068  265.9479  81.05548  226.28698  245.31672
## ENSG00000000460  112.38492  129.1092  121.5846  37.05650  103.45264  112.15255
## ENSG00000000938   87.09667  100.0577   94.2263  28.71824   80.17428   86.91659
## ENSG00000000971 1126.94072 1294.6434 1219.1907 371.58436 1037.37222 1124.61066
##                     RHB053    RHB350     RHB351     RHB353     RHB354
## ENSG00000000003 1184.23584 1505.0803 1106.28237 1331.31927 1200.10116
## ENSG00000000419  719.63750  914.6086  672.26667  809.01729  729.27855
## ENSG00000000457  243.06166  308.9142  227.06189  273.25019  246.31798
## ENSG00000000460  111.12160  141.2277  103.80691  124.92303  112.61031
## ENSG00000000938   86.11761  109.4494   80.44884   96.81352   87.27134
## ENSG00000000971 1114.27274 1416.1621 1040.92466 1252.66668 1129.20076
##                     RHB355     RHB031      RHB032      RHB035      RHB043
## ENSG00000000003 1152.09468 1564.26057 1563.284160 1553.259322 1745.820903
## ENSG00000000419  700.10593 1539.63780 1538.676762 1528.809723 1718.340224
## ENSG00000000457  236.46476  251.84159  251.684386  250.070416  281.072293
## ENSG00000000460  108.10567  116.48136  116.408649  115.662158  130.001096
## ENSG00000000938   83.78031    5.03736    5.034216    5.001933    5.622035
## ENSG00000000971 1084.03044   27.81928   27.801919   27.623634   31.048208
##                      RHB044      RHB046      RHB047      RHB344      RHB345
## ENSG00000000003 1592.230765 1621.769361 1751.834651 1967.842710 2075.471577
## ENSG00000000419 1567.167723 1596.241358 1724.259311 1936.867223 2042.801922
## ENSG00000000457  256.344709  261.100341  282.040489  316.817184  334.145131
## ENSG00000000460  118.564134  120.763701  130.448904  146.533764  154.548258
## ENSG00000000938    5.127432    5.222554    5.641401    6.337008    6.683603
## ENSG00000000971   28.316714   28.842038   31.155158   34.996711   36.910815
##                      RHB347      RHB348     RHB349
## ENSG00000000003 1882.070418 1963.532738 2060.08066
## ENSG00000000419 1852.445058 1932.625093 2027.65327
## ENSG00000000457  303.008084  316.123290  331.66724
## ENSG00000000460  140.146802  146.212826  153.40219
## ENSG00000000938    6.060797    6.323129    6.63404
## ENSG00000000971   33.471311   34.920061   36.63710
## 
## $deviance
## ENSG00000000003 ENSG00000000419 ENSG00000000457 ENSG00000000460 ENSG00000000938 
##       15.805069       10.822281        6.789848        5.987365       38.422946 
## ENSG00000000971 
##       98.649347 
## 
## $method
## [1] "oneway"
## 
## $unshrunk.coefficients
##                 (Intercept) dt_type$TypeSphere
## ENSG00000000003   -9.644690         0.02426371
## ENSG00000000419  -10.142795         0.50650305
## ENSG00000000457  -11.228228        -0.21856670
## ENSG00000000460  -12.010918        -0.20694567
## ENSG00000000938  -12.265829        -3.09288363
## ENSG00000000971   -9.705586        -3.94427959
## 
## $df.residual
## [1] 22 22 22 22 22 22
## 
## $design
##   (Intercept) dt_type$TypeSphere
## 1           1                  0
## 2           1                  0
## 3           1                  0
## 4           1                  0
## 5           1                  0
## 19 more rows ...
## 
## $offset
##          [,1]     [,2]     [,3]     [,4]     [,5]     [,6]     [,7]     [,8]
## [1,] 16.73285 16.87158 16.81153 15.62336 16.65003 16.73078 16.72154 16.96129
##          [,9]    [,10]    [,11]    [,12]    [,13]    [,14]    [,15]    [,16]
## [1,] 16.65345 16.83862 16.73485 16.69403 16.97559 16.97497 16.96854 17.08541
##         [,17]   [,18]    [,19]    [,20]    [,21]    [,22]    [,23]    [,24]
## [1,] 16.99332 17.0117 17.08885 17.20512 17.25837 17.16055 17.20293 17.25093
## attr(,"class")
## [1] "CompressedMatrix"
## attr(,"Dims")
## [1]  6 24
## attr(,"repeat.row")
## [1] TRUE
## attr(,"repeat.col")
## [1] FALSE
## 
## $dispersion
## [1] 0.6553415 0.6506241 0.8504457 1.1639535 1.5643274 0.6586076
## 
## $prior.count
## [1] 0.125
## 
## $AveLogCPM
## [1] 6.036774 5.712271 3.594160 2.484793 1.337520 4.960519
## 
## $df.residual.zeros
## [1] 22 22 22 22 22 22
## 
## $df.prior
## [1] 5.294182
## 
## $var.post
## ENSG00000000003 ENSG00000000419 ENSG00000000457 ENSG00000000460 ENSG00000000938 
##       0.6966812       0.5151580       0.3828555       0.3675714       1.5710027 
## ENSG00000000971 
##       3.7365814 
## 
## $var.prior
## ENSG00000000003 ENSG00000000419 ENSG00000000457 ENSG00000000460 ENSG00000000938 
##       0.6063778       0.6117160       0.6913020       0.7640833       0.8417328 
## ENSG00000000971 
##       0.6304249 
## 
## $samples
##                 group lib.size norm.factors
## RHB037 Primary Tumour 18491977            1
## RHB038 Primary Tumour 21243811            1
## RHB041 Primary Tumour 20005708            1
## RHB049 Primary Tumour  6097330            1
## RHB050 Primary Tumour 17022247            1
## 19 more rows ...
## 
## $table
##                       logFC   logCPM            F       PValue
## ENSG00000000003  0.03500217 6.036774  0.007727913 9.305912e-01
## ENSG00000000419  0.73064829 5.712271  4.536961562 4.233188e-02
## ENSG00000000457 -0.31517746 3.594160  0.874353891 3.579526e-01
## ENSG00000000460 -0.29825555 2.484793  0.594870312 4.471681e-01
## ENSG00000000938 -4.42642317 1.337520 17.013091531 3.126372e-04
## ENSG00000000971 -5.68374650 4.960519 25.107247559 2.876932e-05
## 
## $comparison
## [1] "dt_type$TypeSphere"
## 
## $df.test
## [1] 1 1 1 1 1 1
## 
## $df.total
## [1] 27.29418 27.29418 27.29418 27.29418 27.29418 27.29418
#get all the results 
qlf_output_hits <- topTags(qlf.pos_vs_neg, sort.by = "PValue", n = nrow(nor_count))

# merge gene names with the top hits 
qlf_output_hits_withgn <- merge(filtered_exp[,1:2], qlf_output_hits, by.x = 1, by.y = 0)
head(qlf_output_hits_withgn)
##             Var.1 RHB037       logFC   logCPM            F       PValue
## 1 ENSG00000000003   2561  0.03500217 6.036774  0.007727913 9.305912e-01
## 2 ENSG00000000419   1170  0.73064829 5.712271  4.536961562 4.233188e-02
## 3 ENSG00000000457    387 -0.31517746 3.594160  0.874353891 3.579526e-01
## 4 ENSG00000000460    217 -0.29825555 2.484793  0.594870312 4.471681e-01
## 5 ENSG00000000938     45 -4.42642317 1.337520 17.013091531 3.126372e-04
## 6 ENSG00000000971     76 -5.68374650 4.960519 25.107247559 2.876932e-05
##            FDR
## 1 0.9598094910
## 2 0.1207295397
## 3 0.5235473730
## 4 0.6064154567
## 5 0.0037633875
## 6 0.0006839251

Sort by rank

# Lecture 7 
df <- output_hits[output_hits$EnsemblGeneID %in% qlf_output_hits_withgn$Var.1,]

pVal <- df$P.Value
lFC <- df$logFC
df[,"rank"] <- log(pVal, base = 10)*sign(lFC)
df <- df[order(df$rank),]
write.table(x=subset(df, TRUE, c(symbol, rank)),
            file="rankedRNAseq.rnk", sep = "\t", 
            row.names = FALSE, col.names = FALSE, quote = FALSE)

Gene Set Enrichment Analysis

  1. What method did you use? What genesets did you use? Make sure to specify versions and cite your methods.

From many different gene set enrichment analysis tools, GSEA is used to conduct this analysis. GSEA (version 4.2.3) is downloaded as an application for the most up-to-date access to the tool and friendly GUI. The bader lab gene set and ranked geneset are used to run GSEA. Bader lab dataset is downloaded and ranked gene set are computed as above. These datasets are loaded on to the GSEA application. Load Data Then, I ran GSEA Pre-Ranked after adjusting the parameters as: * Number of Permutations: 1000 * No_Collapse * Max Size: 200 * Min Size: 15 Adjusting Parameters 2. Summarize your enrichment results

Overall Report: file:///Users/karenwon/gsea_home/output/apr06/my_analysis.GseaPreranked.1649226188805/index.html

Report of the result after runing GSEA ## Compare to A2 In A2 the thresholded analysis was conducted using g:profiler. According to the result, both up regulated and down regulated terms that were shown to be the major role were both relevant to metabollic processes. However, although the doenregulated group seem tobe similar, the non-thresholded analysis conducted using GSEA seem to demonstrate immune response related major terms for upregulated set. The two analysis indicates that the disease consist of interaction of both immune response and metabolic degradation which supports the paper and present its association with breast cancer.

Visualizing Gene Set Enrichment Analysis in Cytoscape

Create an enrichment map

The Enrichment Analysis result is visualized using the tool Cytoscape which presents the connection of genesets in common with edges and nodes. The Enrichment Map[@Enrichment] application is downloaded within the Cytoscape platform to easily use GSEA analysis result files.

The parameter settings to create an enrichment map are set as the image below: * FDR Q-Value: 0.05 * P-value: 1.0 * Edge Cutoff: 0.375

Creating Enrichment Map Initial enrichment map creation as an image: Initial map created using Cytoscape There are 1170 Nodes and 7142 edges presented in this map.

Annotate your network

The AutoAnnotate[@Autoannotate] application is used to annotate the network. The settings are adjusted as the image below: Auto Annotate Setting Adjutment The creation of annotation set will then create clusters of nodes indicating the frequency. We could see that there are a few large clusters of connected gene sets at the top of the map and several small clusters in the surroundings of the large sets. There are seperate genesets presented vertically at the bottom of the map. Initial map after auto Annotate ## Edit the map for visualization To Improve the presentation of the map the FDR cutOff is restricted further to 0.01. This results in 646 Nodes and 33395 Edges. The map after auto annotate result in much simpler diagram. Auto Annotate after restricting the cut-off to 0.01 To further organize and improve the presentation of the map, the “prevent cluster overlap” option is selected when setting up the auto annotate parameters. Setting to prevent cluster overlap Then, the resulting map is the image below: Map creation after cluster prevent # Publication ready figure / Collapse network to a theme network The Enrichment Map is simplified with major theme clusters. Some largest themes include Immune Regulation immunity, Immune Response cell, Regulation cell differentiation, and apc g1 degradation. There seem to be no clear novel pathways. Final Map after organizing

Interpretation of the annotated network

  1. Do the enrichment results support conclusions or mechanism discussed in the original paper? How do these results differ from the results you got from Assignment #2 thresholded methods?

According to the map the up-regulated genes reveal the most interaction with the immune response cell and it is connected to other major themes such as immune regulation, regulation cell differentiation, inferon gamma negative, etc. These main terms support the conclusion of the original paper as the immune system cells induces the regulation of the cancer growth. As high Immunoglobuline (IG), cancer-derived Ig are known to be present in cancer cells including carcinomas of breast, this also supports the reason why the major terms are present in the map and justifies the original paper. The major down-regulated term demonstrated in the map is the APC degradation. The APC plays a role in suppressing a tumor so the degradation suggests that the suppressor is inadequate in preventing the cancer cell growth. These similar terms (such as immune response) were also predicted using g:profiler in A2. Therefore, we can conclude that the major causation of the disease in this case is due to immune response malfunction.

  1. Can you find evidence, i.e. publications, to support some of the results that you see. How does this evidence support your result?

Research paper that support the result concluded in this study could be (Nestler et al., 2022) The paper states that the pathways relevant to immune system processes are highly visible and that most of the differentially expressed genes are also the immune related. Therefore concluded that the role of immune processes are significant in tumours. Another paper (Jin et al., 2021) adds to the evidence that their study revealed the immune cells (specifically TME immune cell in this case) functionally affected the progression of breast cancer. THe paper suggests that the TME of breast cancer is involved in multiple immunosuppression, which interacts with tumor cell metastasis.

Post Analysis

Add a post analysis to your main network using specific transcription factors, microRNAs or drugs. Include the reason why you chose the specific miRs, TFs or drugs (i.e publications indicating that they might be related to your model). What does this post analysis show?

The publication specifically explored the Pacritinib treatment as an actionable drug target in breast cancer. The ability of this drug was tested to research its potential inhibiting on tumour deriving cells. Therefore, I intended in adding a post analysis on this drug’s association with the network. However, the bader lab drug gene set does not contain this main drug that the original paper researched so another drug being discussed in the paper, gemcitabine, was chosen to be focused on. The paper states that the “patients were administered gemcitabine and carboplatin (not in bader lab drug set) for a maximum of siz cycles and patients stopped the treatment if they were deemed to have disease progression.” Therefore, we know that the drug was used for treatment id the cancer did not progress further. However, according to the post analysis, unlike the expectation, the Gemcitabine seem to have no connection in the disease pathway. Gemcitabin function to prevent fast-progressing cancer cell by targeting cell proliferation positive but this was not observed in the post analysis conducted.

Citation

Cui, M., Huang, J., Zhang, S., Liu, Q., Liao, Q., & Qiu, X. (1AD, January 1). Immunoglobulin expression in cancer cells and its critical roles in tumorigenesis. Frontiers. Retrieved April 7, 2022, from https://www.frontiersin.org/articles/10.3389/fimmu.2021.613530/full

Disis, M. L. (2010, October 10). Immune Regulation of cancer. Journal of clinical oncology : official journal of the American Society of Clinical Oncology. Retrieved April 7, 2022, from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041789/

Goh, J. Y., Feng, M., Wang, W., Oguz, G., Yatim, S. M. J. M., Lee, P. L., Bao, Y., Lim, T. H., Wang, P., Tam, W. L., Kodahl, A. R., Lyng, M. B., Sarma, S., Lin, S. Y., Lezhava, A., Yap, Y. S., Lim, A. S. T., Hoon, D. S. B., Ditzel, H. J., … Yu, Q. (2017, September 25). Chromosome 1q21.3 amplification is a trackable biomarker and actionable target for breast cancer recurrence. Nature News. Retrieved April 7, 2022, from https://www.nature.com/articles/nm.4405

Jin, J., Li, Y., Zhao, Q., Chen, Y., Fu, S., & Wu, J. B. (2021, May 6). Coordinated regulation of immune contexture: Crosstalk between STAT3 and immune cells during breast cancer progression - cell communication and signaling. BioMed Central. Retrieved April 7, 2022, from https://biosignaling.biomedcentral.com/articles/10.1186/s12964-021-00705-2

Nestler, T., Dalvi, P., Haidl, F., Wittersheim, M., von Brandenstein, M., Paffenholz, P., Wagener-Ryczek, S., Pfister, D., Koitzsch, U., Hellmich, M., Buettner, R., Odenthal, M., & Heidenreich, A. (2022, January 12). Transcriptome analysis reveals upregulation of immune response pathways at the invasive tumour front of metastatic seminoma germ cell tumours. Nature News. Retrieved April 7, 2022, from https://www.nature.com/articles/s41416-021-01621-5

Steipe, B., & Isserlin, R. (Edited). (2022). R-basics https://bcb420-2022.github.io/R_basics/

Ruth Isserlin, Course Lectures (2022)